An Improved Semi-Supervised Clustering Algorithm Based on Active Learning
Abstract
Semi-supervised clustering is one of the major tasks; it aims to group data objects into meaningful classes (clusters) such that the similarity of objects within clusters is maximized and the similarity of objects between clusters is minimized. A dataset may sometimes be mixed in nature, that is, it may consist of both numeric and categorical data. These two types of data naturally differ in their characteristics. Because of these differences, mixed data are better grouped with an ensemble clustering method that uses a split-and-merge approach. In this paper, the original mixed dataset is split into a numeric dataset and a categorical dataset, and each is clustered using both traditional clustering algorithms (K-Means and K-Modes) and fuzzy clustering algorithms (Fuzzy C-Means and Fuzzy C-Modes). The resulting clusters are combined using ensemble clustering methods and evaluated by both F-measure and entropy. It is found that splitting is more beneficial and that fuzzy clustering algorithms yield better results than traditional clustering algorithms.
Keywords: Active learning; Clustering; Semi-supervised learning.
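The split, combine, and evaluate steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the cluster labels are assumed to come from separate K-Means and K-Modes runs that are not shown, and the combination rule (two objects share a final cluster only when both partitions agree) is a simple stand-in, since the abstract does not say which ensemble method is used. The entropy function is the usual size-weighted class entropy per cluster, where lower values indicate purer clusters.

```python
from collections import Counter
from math import log2


def split_mixed(records):
    """Split each record into a numeric part and a categorical part.

    Assumption: numeric attributes are int/float and categorical ones are str.
    """
    numeric, categorical = [], []
    for row in records:
        numeric.append([v for v in row if isinstance(v, (int, float))])
        categorical.append([v for v in row if isinstance(v, str)])
    return numeric, categorical


def combine_labels(labels_a, labels_b):
    """Stand-in ensemble step: intersect two partitions.

    Each distinct pair (label in partition A, label in partition B)
    becomes one final cluster, numbered in order of first appearance.
    """
    pair_to_id = {}
    combined = []
    for pair in zip(labels_a, labels_b):
        combined.append(pair_to_id.setdefault(pair, len(pair_to_id)))
    return combined


def clustering_entropy(cluster_labels, true_labels):
    """Size-weighted average entropy of the true classes inside each cluster."""
    n = len(cluster_labels)
    total = 0.0
    for c in set(cluster_labels):
        members = [t for k, t in zip(cluster_labels, true_labels) if k == c]
        counts = Counter(members)
        h = -sum((m / len(members)) * log2(m / len(members))
                 for m in counts.values())
        total += (len(members) / n) * h
    return total


if __name__ == "__main__":
    # Toy mixed records: two numeric attributes and one categorical attribute.
    records = [[1.0, "red", 2], [1.2, "red", 2], [9.5, "blue", 7], [9.1, "blue", 8]]
    numeric, categorical = split_mixed(records)

    # Hypothetical labels, as if produced by K-Means (numeric part)
    # and K-Modes (categorical part).
    numeric_labels = [0, 0, 1, 1]
    categorical_labels = [0, 0, 1, 1]

    final = combine_labels(numeric_labels, categorical_labels)
    print(final, clustering_entropy(final, ["a", "a", "b", "b"]))
```

Intersecting partitions can only increase the number of clusters, so a real merge step would typically regroup the intersected cells afterwards; that refinement is left out of the sketch.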
Similar resources
An Improved Semi-supervised Clustering Algorithm Based on Active Learning
To address difficulties in traditional semi-supervised clustering algorithms, such as cluster deviation and high-dimensional data processing, a semi-supervised clustering algorithm based on active learning is proposed; it can effectively solve both problems. Using active learning strategies in the algorithm can obtain a large amount of in...
Wised Semi-Supervised Cluster Ensemble Selection: A New Framework for Selecting and Combing Multiple Partitions Based on Prior knowledge
The Wisdom of Crowds, an innovative theory described in social science, claims that the aggregate decisions made by a group will often be better than those of its individual members if the four fundamental criteria of this theory are satisfied. This theory has been used in clustering problems. Previous research showed that this theory can significantly increase the stability and performance of...
A confidence-based active approach for semi-supervised hierarchical clustering
Semi-supervised approaches have proven to be effective in clustering tasks. They allow user input, thus improving the quality of the clustering obtained, while maintaining a controllable level of user intervention. Despite being an important class of algorithms, hierarchical clustering has been little explored in semi-supervised solutions. In this report, we address the problem of semi-supervise...
An Efficient Iterative Framework for Semi-Supervised Clustering Based Batch Sequential Active Learning Approach
Semi-supervised learning is a field of machine learning. In previous work, the selection of pairwise constraints for semi-supervised clustering was resolved iteratively using an active learning method. Semi-supervised clustering is derived from pairwise constraints, which are of two kinds: must-link and cannot-link. In this system, enhanced iterative...